But I’m not qualified to be a data scientist!
How to transition from university to the data science industry
Erika Braithwaite
2022-03-10
Welcome!
So you’re thinking about leaving the comforting womb of university, and transitioning to a data science career in industry? You may be feeling….
- Overwhelmed by the possibilities
- Confused about the terminology
- Insecure about your capacities
- And so many more emotions
Why
Objectives
I’ll do my best to address these objectives in the most data-driven approach I can!
- Dispel some myths about the data science industry
- Tips on how to transition
- Let’s talk about your CV, the job search and the interview
- New directions in the data science industry
- Bestow some glimmer of hope ⭐
A bit about me
I’m the CEO of a health data science startup in Montreal www.precision-analytics.ca

My background
- BA Psychology, Concordia
- MSc Psychology, Université de Montréal
- PhD Epidemiology and Biostatistics, McGill
Our company
- Founded in 2017
- Began as a statistical consultancy, evolved to a blend of Software as a Service
- Work in the biopharmaceutical sector to create tools that combine data storage, analysis and communication

Our story
What is a data scientist
A combination of programming, statistics and domain knowledge.
I like to think of it as story telling using data

A very scary (and real) job posting
- Coding knowledge and experience with several languages: C, C++, Java, JavaScript, etc.
- Knowledge and experience in statistical and data mining techniques: GLM/Regression, Random Forest, Boosting, Trees, text mining, social network analysis, etc.
- Experience querying databases and using statistical computer languages: R, Python, SLQ, etc.
- Experience using web services: Redshift, S3, Spark, Digital Ocean, etc.
- Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
- Experience analyzing data from 3rd party providers: Google Analytics, Site Catalyst, Coremetrics, Adwords, Crimson Hexagon, Facebook Insights, etc.
- Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.
- Experience visualizing/presenting data for stakeholders using: Periscope, Business Objects, D3, ggplot, etc.
Myths about data science
- Myth 1. You need a PhD to become a data scientist
- Myth 2. All data scientists use Python
- Myth 3. Data science == AI & machine learning
- Myth 4. Data science is all about technical skills
Letting the data speak!
To tackle these myths, Kaggle conducts an annual survey of data scientists. Over 25,000 respondents completed the survey in 2021
‼️ Kaggle is a platform that hosts machine learning competitions. It’s users are not representative of the entire data science community ‼️
Data available for 2021, along with survey and results.
Demographics of respondents

Types of data scientist positions

Myth 1. You need a PhD to become a data scientist

Experience of respondents

Myth 3. Data science == Machine learning and AI

Myth 4. Data science challenges are mostly analytical
The survey did not explicitly ask the way respondents broke up their time. So I did the next best thing… ask twitter

Data scientists challenges
In 2017, the Kaggle survey asked respondents about the “biggest challenges in data science”.
From here we can see that most people identify “dirty data” the toughest part of the job. The rest of the issues seem be organizational.

On transitioning
On transitioning
Understand your new “audience”
💰 The currency in academia versus the currency in the private sector 💰

Gaining more experience: Courses
The American Statistical Association held a two-day Data Science summit. 72 educators, researchers and practitioners in statistics, mathematics, computer science, and data science from academia, industry, government and nonprofit gathered to put forth recommendations for future data science programs.
Data science courses should expose students to:
|
Introduction to statistics
|
|
Data analysis in the real world
|
|
Math and algorithms
|
|
Answering real problems
|
|
Expose students to modern tools
|
|
Teach data ethics
|
|
Active learning
|
I’d add to this list: study design. Knowing how data was generated/collected will always inform the types of questions you can ask, and appropriate method of analysis.
Build a portfolio
Showcase your work
Internship
- Get relevant experience, learn about new industry, create a network
- Check out your university coop programs
- Internship programs in industry IVADO labs, Plotly
- Think Tanks Rand opportunities and here
Don’t forget to reach out to your network and tell people you’re looking to get into a new industry
Preparing your cover letter and CV
- Use the cover letter to explain why you’re applying for the job even if your profile doesn’t perfectly match the job description
- Instead of focusing on all the highlights of your academic career, explain WHY you want to work for THAT company
- In general, keep it short and sweet (max 2 pages)
- Academic CV != Professional CV
- Only provide extensive details about academic experience if it is relevant
- Tailor your CV and cover letter to each application - use key words from posting
More on CV’s
A few tips (based on my personal pet peeves)
- Use consistent formatting
- If using a job aggregator site, upload your actual CV (the printout we see is very ugly)
- Don’t include abbreviations or technical jargon
- Don’t indicate that you have high proficiency in 22 programming languages
- Don’t list 30 MOOC’s I know you didn’t finish
Searching for a position

- Focus on the skills listed in the job positing, not the title but be weary of descriptions
- Widen your search beyond “data scientist” - many analogous positions exist with different titles: analyst, bio/statistician, health economist, R/SAS/Python programmer, technical research assistant, quantitative scientist, consultant ect.
- Don’t expect that you will possess 100% of the stated skills; it’s often a wishlist
- Try to talk to people (ideally former employees) to get insider perspective
- Be aware that many companies are jumping on the “data” bandwagon without having any infrastructure to support a junior hire
During the interview
- Expect technical questions and/or coding challenge
- Be honest about your skills
- Be mindful about expressing your desired to be paid to learn
- Highlight the value you bring to the employer, rather than the value you accrued in university
- Be ready to show that you are able to explain technical concepts to non-technical audiences
Jacqueline Nolis’s book Build a Career in Data Science
“I want to hear about a project they’ve worked on recently. I ask them about how the project started, how they determined it was worth time and effort, their process, and their results. I also ask them about what they learned from the project. I gain a lot from answers to this question: if they can tell a narrative, how the problem related to the bigger picture, and how they tackled the hard work of doing something”.
Asking for feedback after your interview can help highlight the interviewer’s perceptions of your strengths and weaknesses
Employer fears
In general, employers are concerned that new hires without job experience will struggle to see the bigger picture and keep up with the pace.
|
Undergrad & Masters
|
PhD
|
|
No revelant practical experience
|
Experience is hyperfocused
|
|
Will expect a lot of hand holding
|
Unaccustomed to working collaboratively in multidisplinary teams
|
|
Training is expensive and time consuming
|
Both under- and over-qualified
|
|
New grads tend to get trained and then leave
|
Understimulated by day-to-day
|
Try to address these fears explicitly in your cover letter, CV and/or during the interview
Conclusion

Pivoting to data science from humanities, social sciences or any other disciplines doesn’t mean you need to leave all of your training behind.
✔️ You have knowledge and/or experience in at least one of these circles
☑️ Look for people in your substantive area who are doing some more quantitative/tech driven work
☑️ Be confident that you know how to learn
☑️ Be ready to rebrand yourself
✔️ Be ready to be creative in your search and to be a hustler
☑️ Your path is the right path
New directions (my hot takes)
- Large push towards “automating” analytics is set to change the role of data scientists (e.g. Data Robot, Alteryx, Dataiku, Google Cloud AI); storytelling and substantive expertise are the last aspects of data and communication that cannot be productized
- Many organizations want to innovate but are not sure how; many ML investments have failed to deliver value
- In the healthcare sector, billions invested in AI for drug discovery but not a single drug has gone to market
Happy to answer any questions